Cohesion and Repulsion in Bayesian Distance Clustering
Authors
Abstract
Clustering in high dimensions poses many statistical challenges. While traditional distance-based clustering methods are computationally feasible, they lack a probabilistic interpretation and rely on heuristics for estimation of the number of clusters. On the other hand, model-based techniques often fail to scale, and devising algorithms that are able to effectively explore the posterior space is an open problem. Based on recent developments in Bayesian clustering, we propose a hybrid solution that entails defining a likelihood on pairwise distances between observations. The novelty of the approach consists in including both cohesion and repulsion terms in the likelihood, which allows for cluster identifiability. This implies that clusters are composed of objects which have small "dissimilarities" among themselves (cohesion) and similar dissimilarities to observations in other clusters (repulsion). We show how this modelling strategy has an interesting connection with existing proposals in the literature as well as a decision-theoretic interpretation. The proposed method is efficient and applicable to a wide variety of scenarios. We demonstrate the approach in a simulation study and an application in digital numismatics.
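To make the idea of a likelihood on pairwise distances concrete, the following is a minimal sketch, not the paper's model. It assumes that within-cluster distances (cohesion) and between-cluster distances (repulsion) each follow a Gamma distribution; the function names, the Gamma choice, and the parameter values are illustrative assumptions, and no prior on the partition or posterior sampling is included.

```python
import math
from itertools import combinations


def gamma_logpdf(x, shape, rate):
    """Log density of a Gamma(shape, rate) distribution at x > 0."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1) * math.log(x) - rate * x)


def partition_loglik(dist, labels, cohesion=(2.0, 4.0), repulsion=(8.0, 2.0)):
    """Log-likelihood of all pairwise distances given a partition.

    Pairs in the same cluster contribute a cohesion term; pairs in
    different clusters contribute a repulsion term. The (shape, rate)
    values are hypothetical and fixed here purely for illustration.
    """
    total = 0.0
    for i, j in combinations(range(len(labels)), 2):
        shape, rate = cohesion if labels[i] == labels[j] else repulsion
        total += gamma_logpdf(dist[i][j], shape, rate)
    return total


# Toy one-dimensional data with two well-separated groups.
points = [0.0, 0.3, 0.5, 4.0, 4.2, 4.6]
dist = [[abs(a - b) for b in points] for a in points]

good = partition_loglik(dist, [0, 0, 0, 1, 1, 1])  # matches the groups
bad = partition_loglik(dist, [0, 1, 0, 1, 0, 1])   # mixes the groups
print(good > bad)
```

In this toy setting the partition that matches the two groups scores higher, because both its small within-cluster distances and its large between-cluster distances are plausible under the assumed densities.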
Related resources
Cohesion and cohesive devices in a contrastive analysis between GE and ESP texts
The present study was an attempt to conduct a contrastive analysis between General English (GE) and English for Specific Purposes (ESP) texts in terms of cohesion and cohesive devices. To this end, thirty texts from different ESP and GE textbooks were randomly selected. They were then analyzed manually to find the frequency of cohesive devices. Cohesive devices include reference, substitution, ...
The clustering and classification data mining techniques in insurance fraud detection: the case of Iranian car insurance
Given the ever-increasing spread of fraud in the insurance sector, especially in automobile insurance, and its negative consequences for insurance companies, applying appropriate and efficient methods to identify and detect fraud in this area is essential. Understanding the pattern in data on previously reported claims can be useful in determining whether a damage claim is genuine or not. One of the most common and widely used ways of discovering patterns in data is the use of ...
AHP algorithm and un-supervised clustering in auto insurance fraud detection
This thesis is a study of insurance fraud in Iran's automobile insurance industry. It explores the use of expert linkage between unsupervised clustering and the analytic hierarchy process (AHP), and reports the findings from applying these algorithms to automobile insurance claim fraud detection. The expert linkage determination objective function plan provides us with a way to determine whi...
Reconciliation of Unsupervised Clustering, Segmentation and Cohesion
This extended abstract examines the progress of a project on unsupervised language learning, and focuses on two different approaches to segmentation, as well as how cohesion may be generalized from its definitive morphosyntactic instantiation. It is intended as a discussion paper, and outlines the specific hypotheses currently being tested.
An Incremental Text Segmentation by Clustering Cohesion
This paper describes a new method, called IClustSeg, for linear text segmentation by topic using an incremental overlapped clustering algorithm. Incremental algorithms are able to process new objects as they are added to the collection and, according to the changes, to update the results using previous information. In our approach, we maintain a structure to get an incremental overlapped cluste...
Journal
Journal title: Journal of the American Statistical Association
Year: 2023
ISSN: 0162-1459, 1537-274X, 2326-6228, 1522-5445
DOI: https://doi.org/10.1080/01621459.2023.2191821